Text Extraction from Skewed Images
Abstract
Extracting text from an image is a classical problem in computer vision. It involves detection, localization, tracking, extraction, enhancement, and recognition of the text in a given image. However, variations in text size, style, orientation, and alignment, together with low image contrast and complex backgrounds, make automatic text extraction extremely challenging. Text extraction requires binarization, which discards significant information contained in gray-scale images. Images may also contain noise and have complex structure, which makes extraction more difficult. This paper proposes an algorithm that is insensitive to noise, skew, and text orientation. It is free from the artifacts that are usually introduced by thresholding using morphological operators. Examples are presented to illustrate the performance of the proposed method. The text extraction system was evaluated on a corpus of three kinds of images, and promising precision was obtained.
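The information loss attributed to binarization above can be illustrated with a minimal sketch. It assumes a fixed global threshold; the helper name `binarize`, the threshold of 128, and the pixel values are hypothetical and chosen only for illustration. It does not reproduce the algorithm proposed in the paper.

```python
# Illustration only: global thresholding on a tiny gray-scale patch.
# The threshold and the pixel values are made up for this sketch.

def binarize(gray, threshold=128):
    """Map each gray level to 0 or 255 using one fixed global threshold."""
    return [[255 if px >= threshold else 0 for px in row] for row in gray]

# A faint stroke (values near 120) on a slightly darker background (near 100).
gray_patch = [
    [100, 102, 120, 101],
    [99, 121, 122, 100],
    [101, 119, 103, 102],
]

binary_patch = binarize(gray_patch)

# Count how many distinct intensity levels survive in each representation.
distinct_gray = {px for row in gray_patch for px in row}
distinct_binary = {px for row in binary_patch for px in row}

print(len(distinct_gray), len(distinct_binary))
```

In this patch every pixel falls below the threshold, so the low-contrast stroke disappears from the binary output even though it is still distinguishable in the gray-scale values.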
Similar resources
An Adaptive Approach: Text Line Extraction from Multi-Skewed Hand Written Documents
Advancing technology has made document image processing an important feature in the automation of office documentation. Digital filing systems save space, paper, and printing costs. A problem arises when the document to be read is not placed correctly in the scanner, which leads to misinterpretation of the document and increases the storage space. This paper deals with extraction of text from those skewed...
Document Analysis And Classification Based On Passing Window
In this paper we present a document analysis and classification system to segment and classify the contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction, and document classification. In preprocessing, a document image is enhanced by removing noise, binarization, and detecting and correcting image skew. In document segmentation, an algorith...
Segmentation of Camera Captured Business Card Images for Mobile Devices
Due to large deformations in camera-captured images, the variety of business cards, and the computational constraints of mobile devices, designing an efficient Business Card Reader (BCR) is challenging for researchers. Extracting text regions and segmenting them into characters is one such challenge. In this paper, we have presented an efficient character segmentation t...
Extraction of Original Text Document from a Set of Degraded Text Documents from the Same Source
Information extraction is the task of extracting structured data from a degraded document. It includes extraction of data such as text, images, or graphics from sources such as an image, video, or documents. Text detection and extraction from degraded documents finds application in a wide range of studies. In this paper, an Optical Character Recognition-less (OCR-less) method of obtaining an origi...
Extract the Punjabi Word from Machine Printed Document Images
Extracting Punjabi words from images has been a very active area of research during the last decades due to its wide range of solutions to real-world problems. A lot of work has been done in languages like Chinese, Arabic, Devanagari, Urdu, and English. A neural network based Gurmukhi recognition system has been developed. Range-free skew detection and correction algorithms for de-skewing Gurmukhi...